๐Ÿฟ๏ธ ScourBrowse
๐Ÿ“ Document Chunking

Semantic Segmentation, Context Preservation, Retrieval Optimization, Text Processing

Why Your Chunking Strategy Makes or Breaks Your AI System
medium.com · 4d
Discuss: Hacker News
Text Chunking
Semantic Scene Graph for Ultrasound Image Explanation and Scanning Guidance
arxiv.org · 10h
Advanced OCR
Clustering News Articles for Topic Detection: A Technical Deep Dive
dev.to · 3d
Discuss: DEV
Document Clustering
davidchisnall/igk: I got Knuth'd: A compiler for documents
github.com · 8h
Concrete Syntax
Kumo Surfaces Structured Data Patterns Generative AI Misses
thenewstack.io · 21m
Graph Databases
Why Your Next LLM Might Not Have A Tokenizer
towardsdatascience.com · 19h
Grammar Induction
Markov-Enhanced Clustering for Long Document Summarization: Tackling the 'Lost in the Middle' Challenge with Large Language Models
arxiv.org · 1d
Text Chunking
New: Improve Apache Iceberg query performance in Amazon S3 with sort and z-order compaction
aws.amazon.com · 17h
Burrows-Wheeler
Portable Network Graphics (PNG) Specification (Third Edition)
w3.org · 17h
Discuss: Hacker News
WebP Analysis
Text2Struct: A Machine Learning Pipeline for Mining Structured Data from Text
arxiv.org · 1d
Character Classification
Machine Learning Fundamentals: active learning
dev.to · 22h
Discuss: DEV
Grammar Induction
Practical tips to optimize documentation for LLMs, AI agents, and chatbots
biel.ai · 19h
Discuss: Hacker News
Archive Automation
MemeMind: A Large-Scale Multimodal Dataset with Chain-of-Thought Reasoning for Harmful Meme Detection
arxiv.org · 10h
Vector Embeddings
The modern text processing pipeline: Overview
newroadoldway.com · 1d
Discuss: Lobsters, r/programming
Unicode Normalization
PDF Retrieval Augmented Question Answering
arxiv.org · 1d
Multi-vector RAG
ByteSpan: Information-Driven Subword Tokenisation
arxiv.org · 1d
Binary Linguistics
What LLMs Know About Their Users
schneier.com · 3h
Local LLMs
BPCLIP: A Bottom-up Image Quality Assessment from Distortion to Semantics Based on CLIP
arxiv.org · 1d
JPEG XL
Recurrent Visual Feature Extraction and Stereo Attentions for CT Report Generation
arxiv.org · 10h
Advanced OCR
LLMs for Customized Marketing Content Generation and Evaluation at Scale
arxiv.org · 1d
Feed Optimization